-
Notifications
You must be signed in to change notification settings - Fork 88
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Tolerate failure #183
base: 4.x
Are you sure you want to change the base?
Tolerate failure #183
Conversation
|
||
Common causes are: | ||
|
||
- When the hit result contains top-level fields that are {logstash-ref}/processing.html#reserved-fields[reserved in Logstash] but do not have the expected shape. Use the <<plugins-{type}s-{plugin}-target>> directive to avoid conflicts with the top-level namespace. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@karenzone can I get 👀 on this one to make sure I'm not risking breaking the docs build?
It looks like we added {logstash-ref}/processing.html#reserved-fields
in 7.7 (it doesn't exist on 7.6 and before), but is present on all of the versions of the reference available in the quick picker (master
, current
, 8.5
, 8.4
, 7.17
).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It looks like we added {logstash-ref}/processing.html#reserved-fields in 7.7 (it doesn't exist on 7.6 and before), but is present on all of the versions of the reference available in the quick picker (master, current, 8.5, 8.4, 7.17).
The gemlock release file controls which plugin versions get mapped/picked up for each stack version when we run docgen. For that reason, the plugin<->stack versioning generally works itself out without any manual intervention from us.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Left some minor wording tweaks inline for your consideration and addressed versioning question. Otherwise, LGTM
TL;DR. The versioning should work itself out and your linking in spot on.
|
||
Common causes are: | ||
|
||
- When the hit result contains top-level fields that are {logstash-ref}/processing.html#reserved-fields[reserved in Logstash] but do not have the expected shape. Use the <<plugins-{type}s-{plugin}-target>> directive to avoid conflicts with the top-level namespace. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It looks like we added {logstash-ref}/processing.html#reserved-fields in 7.7 (it doesn't exist on 7.6 and before), but is present on all of the versions of the reference available in the quick picker (master, current, 8.5, 8.4, 7.17).
The gemlock release file controls which plugin versions get mapped/picked up for each stack version when we run docgen. For that reason, the plugin<->stack versioning generally works itself out without any manual intervention from us.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Signing off on the docs portion in case you're waiting for an approval to move forward.
da01542
to
b42442c
Compare
When SECURE_INTEGRATION is speicified, the (non-secure) `:integration` specs are excluded, so we cannot have the `:secure_integration` specs wrapped in a context flagged as `:integration`.
In order for the `ca_trusted_fingerprint` specs to work with the CA's fingerprint, ES needs to be configured to present a cert chain that includes the CA.
b42442c
to
4a9e689
Compare
When an Event cannot be created directly from the hit, or when the docinfo cannot be merged into a non-hash field in the hit, emit an Event tagged with `_elasticsearch_input_failure` that contains the JSON-encoded hit in `[event][original]` instead of crashing.
4a9e689
to
8912943
Compare
lib/logstash/inputs/elasticsearch.rb
Outdated
# @param scroll_id [String]: a scroll id to resume | ||
# @return [Array(Boolean,String)]: a tuple representing whether the response | ||
# | ||
def process_next_scroll(output_queue, scroll_id) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I cant quite figure out how this is used.
rescue => e | ||
serialized_hit = hit.to_json | ||
logger.warn("Event creation error, original data now in [event][original] field", message: e.message, exception: e.class, data: serialized_hit) | ||
return event_factory.new_event('event' => { 'original' => serialized_hit }, 'tags' => ['_elasticsearch_input_failure']) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
return event_factory.new_event('event' => { 'original' => serialized_hit }, 'tags' => ['_elasticsearch_input_failure']) | |
event_factory.new_event('event' => { 'original' => serialized_hit }, 'tags' => ['_elasticsearch_input_failure']) |
Nit: Shouldnt need the explicit return, but I see that is not the only instance in this project.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is technically redundant, but present for readability.
The explicit return
statement makes clear to me that we are intentionally returning that object from the rescue
... block, because otherwise when I'm reading code I don't know whether the return value matters to the caller(s).
It's a side-effect of using exceptions for flow-control (which is less than ideal, but it is what we are already starting with).
set_docinfo_fields(hit, event) if @docinfo | ||
|
||
event | ||
rescue => e |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
rescue => e | |
rescue RuntimeError => e |
Looks like set_docinfo_fields
explicitly raises a RuntimeError
now? May be best to be as specific as possible with error catching.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, but this isn't the only exception we're handling here.
We know that EventFactory#new_event
(which delegates to Event::new
) can give us exceptions too, specifically when the payload uses reserved fields in incorrect ways. We want to catch any situation in which we fail to create an event from the hit.
cat es.crt ca.crt > es.chain.crt | ||
|
||
# output ISO8601 timestamp to file | ||
date -Iseconds > GENERATED_AT |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I feel like expiration date would be more useful 🤷 or both.
openssl x509 -enddate -noout -in es.crt | sed 's/notAfter=//'
|
||
event | ||
rescue => e | ||
serialized_hit = hit.to_json |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm assuming if these were not json serializable in the first place we would never get this far?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ehhh, we deserialized the hit from json, so it should be able to go back?
If we can't, we are at least not creating a new failure scenario (merely failing to handle something that currently already crashes the plugin).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Test cleanup looks good (one small suggestion on cert generation script).
Release prep looks correct (version number is good and changelog looks accurate)
Had a couple nit comments and a question about potential unused code. Other than that this looks like a good improvement.
rescue => e | ||
serialized_hit = hit.to_json | ||
logger.warn("Event creation error, original data now in [event][original] field", message: e.message, exception: e.class, data: serialized_hit) | ||
return event_factory.new_event('event' => { 'original' => serialized_hit }, 'tags' => ['_elasticsearch_input_failure']) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is technically redundant, but present for readability.
The explicit return
statement makes clear to me that we are intentionally returning that object from the rescue
... block, because otherwise when I'm reading code I don't know whether the return value matters to the caller(s).
It's a side-effect of using exceptions for flow-control (which is less than ideal, but it is what we are already starting with).
set_docinfo_fields(hit, event) if @docinfo | ||
|
||
event | ||
rescue => e |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, but this isn't the only exception we're handling here.
We know that EventFactory#new_event
(which delegates to Event::new
) can give us exceptions too, specifically when the payload uses reserved fields in incorrect ways. We want to catch any situation in which we fail to create an event from the hit.
|
||
event | ||
rescue => e | ||
serialized_hit = hit.to_json |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ehhh, we deserialized the hit from json, so it should be able to go back?
If we can't, we are at least not creating a new failure scenario (merely failing to handle something that currently already crashes the plugin).
When an Event cannot be created directly from the hit, or when the
docinfo cannot be merged into a non-hash field in the hit, emit an
Event tagged with
_elasticsearch_input_failure
that contains theJSON-encoded hit in
[event][original]
instead of crashing.Resolves #182